ARRAYED BIOMOLECULES AND THEIR USE IN SEQUENCING

Reference to Related Application 

This Application is a continuation-in-part of PCT/GB99/02487, filed July 30, 1999.

Field of the Invention

This invention relates to fabricated arrays of molecules, and to their analytical 
applications. In particular, this invention relates to the use of fabricated arrays in 
methods for obtaining genetic sequence information.

Background of the Invention

Advances in the study of molecules have been led, in part, by improvement in
technologies used to characterise the molecules or their biological reactions. In 
particular, the study of nucleic acids, DNA and RNA, has benefited from developing 
technologies used for sequence analysis and the study of hybridisation events. 

An example of the technologies that have improved the study of nucleic acids is
the development of fabricated arrays of immobilised nucleic acids. These arrays typically
consist of a high-density matrix of polynucleotides immobilised onto a solid support
material. Fodor et al., Trends in Biotechnology (1994) 12:19-26, describes ways of
assembling the nucleic acid arrays using a chemically sensitised glass surface protected
by a mask, but exposed at defined areas to allow attachment of suitably modified
nucleotides. Typically, these arrays may be described as "many molecule" arrays, as
distinct regions are formed on the solid support comprising a high density of one specific
type of polynucleotide.

An alternative approach is described by Schena et al., Science (1995) 270:467-470,
where samples of DNA are positioned at predetermined sites on a glass microscope
slide by robotic micropipetting techniques. The DNA is attached to the glass surface
along its entire length by non-covalent electrostatic interactions. However, although
hybridisation with complementary DNA sequences can occur, this approach may not
permit the DNA to be freely available for interacting with other components such as
polymerase enzymes, DNA-binding proteins, etc.

Recently, the Human Genome Project determined the entire sequence of the
human genome: all 3 x 10^9 bases. The sequence information represents that of an average
human. However, there is still considerable interest in identifying differences in the
genetic sequence between different individuals. The most common form of genetic
variation is single nucleotide polymorphisms (SNPs). On average one base in 1000 is a
SNP, which means that there are 3 million SNPs for any individual. Some of the SNPs
are in coding regions and produce proteins with different binding affinities or properties.
Some are in regulatory regions and result in a different response to changes in levels of
metabolites or messengers. SNPs are also found in non-coding regions, and these are
also important as they may correlate with SNPs in coding or regulatory regions. The key
problem is to develop a low cost way of determining one or more of the SNPs for an
individual.
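
As a quick check of the arithmetic above, the following Python sketch recomputes the
SNP count from the two figures given in the text (the genome size and one SNP per
1000 bases). The function and constant names are illustrative only.

```python
GENOME_BASES = 3 * 10**9  # genome size quoted in the text
SNP_INTERVAL = 1000       # "one base in 1000 is a SNP"


def expected_snps(bases: int, interval: int = SNP_INTERVAL) -> int:
    """Expected number of SNPs in a stretch of `bases` bases."""
    return bases // interval


print(expected_snps(GENOME_BASES))
```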

The nucleic acid arrays may be used to determine SNPs, and they have been used
to study hybridisation events (Mirzabekov, Trends in Biotechnology (1994) 12:27-32).
Many of these hybridisation events are detected using fluorescent labels attached to
nucleotides, the labels being detected using a sensitive fluorescent detector, e.g. a
charge-coupled detector (CCD). The major disadvantages of these methods are that it is
not possible to sequence long stretches of DNA, and that repeat sequences can lead to
ambiguity in the results. These problems are recognised in Automation Technologies for
Genome Characterisation, Wiley-Interscience (1997), ed. T. J. Beugelsdijk, Chapter 10:
205-225.

In addition, the use of high-density arrays in a multi-step analysis procedure can
lead to problems with phasing. Phasing problems result from a loss in the
synchronisation of a reaction step occurring on different molecules of the array. If some
of the arrayed molecules fail to undergo a step in the procedure, subsequent results
obtained for these molecules will no longer be in step with results obtained for the other
arrayed molecules. The proportion of molecules out of phase will increase through
successive steps and consequently the results detected will become ambiguous. This
problem is recognised in the sequencing procedure described in US-A-5302509.
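
The growth of the out-of-phase population can be illustrated with a minimal Python
sketch. It assumes, purely for illustration, that every molecule in a multi-molecule
spot completes each step independently with the same probability; the 95% step yield
used below is a hypothetical value, not a figure from the cited art.

```python
def in_phase_fraction(step_yield: float, cycles: int) -> float:
    """Fraction of molecules still in step after `cycles` steps.

    Assumes each molecule completes every step independently with
    probability `step_yield`; one failure puts it out of phase for good.
    """
    return step_yield ** cycles


# Hypothetical 95% step yield: the in-phase signal decays geometrically.
for n in (1, 10, 25, 50):
    print(n, round(in_phase_fraction(0.95, n), 3))
```

Under this assumption the in-phase fraction falls geometrically with the number of
steps, which is why the summed signal from a multi-molecule spot becomes ambiguous.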

An alternative sequencing approach is disclosed in EP-A-0381693, which
comprises hybridising a fluorescently-labelled strand of DNA to a target DNA sample
suspended in a flowing sample stream, and then using an exonuclease to cleave
repeatedly the end base from the hybridised DNA. The cleaved bases are detected in
sequential passage through a detector, allowing reconstruction of the base sequence of
the DNA. Each of the different nucleotides has a distinct fluorescent label attached,
which is detected by laser-induced fluorescence. This is a complex method, primarily
because it is difficult to ensure that every nucleotide of the DNA strand is labelled and
that this has been achieved with high fidelity to the original sequence.

WO-A-96/27025 is a general disclosure of single molecule arrays. Although
sequencing procedures are disclosed, there is little description of the applications to
which the arrays can be applied. There is also only a general discussion on how to
prepare the arrays.

Summary of the Invention

According to the present invention, a device comprises a high density array of
molecules capable of interrogation and immobilised on a solid generally planar surface,
wherein the array allows the molecules to be individually resolved by optical microscopy,
and wherein each molecule is immobilised by covalent bonding to the surface, other than
at that part of each molecule that can be interrogated.

According to a second aspect of the invention, a device comprises a high density
array of relatively short molecules and relatively long polynucleotides immobilised on the
surface of a solid support, wherein the polynucleotides are at a density that permits
individual resolution of those parts that extend beyond the relatively short molecules. In
this aspect, the shorter molecules can prevent non-specific binding of reagents to the
solid support, and therefore reduce background interference.

According to a third aspect of the invention, a device comprises an array of
polynucleotide molecules immobilised on a solid surface, wherein each molecule
comprises a polynucleotide duplex linked via a covalent bond to form a hairpin loop
structure, one end of which comprises a target polynucleotide, and the array has a surface
density which allows the target polynucleotides to be individually resolved. In this
aspect, the hairpin structures act to tether the target to a primer polynucleotide. This
prevents loss of the primer-target during the washing steps of a sequencing procedure.
The hairpins may therefore improve the efficiency of the sequencing procedures.

The arrays of the present invention comprise what are effectively single
molecules. This has many important benefits for the study of the molecules and their
interaction with other biological molecules. In particular, fluorescence events occurring
on each molecule can be detected using an optical microscope linked to a sensitive
detector, resulting in a distinct signal for each molecule.



When used in a multi-step analysis of a population of single molecules, the
phasing problems that are encountered using high density (multi-molecule) arrays of the
prior art can be reduced or removed. Therefore, the arrays also permit a massively
parallel approach to monitoring fluorescent or other events on the molecules. Such
massively parallel data acquisition makes the arrays extremely useful in a wide range of
analysis procedures which involve the screening/characterising of heterogeneous mixtures
of molecules. The arrays can be used to characterise a particular synthetic chemical or
biological moiety, for example in screening for particular molecules produced in
combinatorial synthesis reactions.

The arrays of the present invention are particularly suitable for use with
polynucleotides as the molecular species. The preparation of the arrays requires only
small amounts of polynucleotide sample and other reagents, and can be carried out by
simple means. Polynucleotide arrays according to the invention permit massively parallel
sequencing chemistries to be performed. For example, the arrays permit simultaneous
chemical reactions on and analysis of many individual polynucleotide molecules. The
arrays are therefore very suitable for determining polynucleotide sequences.

An array of the invention may also be used to generate a spatially addressable
array of single polynucleotide molecules. This is the simple consequence of sequencing
the array. Particular advantages of such a spatially addressable array include the
following:

1) Polynucleotide molecules on the array may act as identifier tags and may only
need to be 10-20 bases long, and the efficiency required in the sequencing steps may only
need to be better than 50%, as there will be no phasing problems.

2) The arrays may be reusable for screening once created and sequenced. All
possible sequences can be produced in a very simple way, e.g. compared to a high density
multi-molecule DNA chip made using photolithography.
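
The figures in point 1 can be illustrated with a short Python sketch. It assumes (this
is not stated in the text) that tags are built from the four standard bases, and that
each sequencing cycle on a single molecule succeeds independently with a fixed
probability, a failed cycle merely delaying the read rather than corrupting it.

```python
def distinct_tags(length: int) -> int:
    """Number of different tag sequences of a given length (4 bases)."""
    return 4 ** length


def expected_cycles(bases: int, efficiency: float) -> float:
    """Mean number of cycles needed to read `bases` bases on one molecule
    when each cycle succeeds independently with probability `efficiency`."""
    return bases / efficiency


print(distinct_tags(10), distinct_tags(20))
print(expected_cycles(20, 0.5))
```

Under these assumptions a lower cycle efficiency lengthens the read but does not make
the signal from an individual molecule ambiguous.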

Description of the Drawings

Figure 1 is a schematic representation of apparatus that may be used to image
arrays of the present invention;
Figure 2 illustrates the immobilisation of a polynucleotide to a solid surface via
a microsphere;



Figure 3 shows a fluorescence time profile from a single fluorophore-labelled
oligonucleotide, with excitation at 514nm and detection at 600nm;

Figure 4 shows fluorescently labelled single molecule DNA covalently attached
to a solid surface; and

Figure 5 shows images of surface bound oligonucleotides hybridised with the 
complementary sequence. 
Description of the Invention 

According to the present invention, the single molecules immobilised onto the
surface of a solid support should be capable of being resolved by optical means. This
means that, within the resolvable area of the particular imaging device used, there must
be one or more distinct images each representing one molecule. Typically, the molecules
of the array are resolved using a single molecule fluorescence microscope equipped with
a sensitive detector, e.g. a charge-coupled detector (CCD). Each molecule of the array
may be analysed simultaneously or, by scanning the array, a fast sequential analysis can
be performed.

The molecules of the array are typically DNA, RNA or nucleic acid mimics, e.g.
PNA or 2'-O-Meth-RNA. However, any other biomolecules, including peptides,
polypeptides and other organic molecules, may be used. The molecules are formed on
the array to allow interaction with other "cognate" molecules. It is therefore important
to immobilise the molecules so that the portion of the molecule not physically attached
to the solid support is capable of being interrogated by a cognate. In some applications all
the molecules in the single array will be the same, and may be used to interrogate
molecules that are largely distinct. In other applications, the molecules on the array may
all, or substantially all, be different, e.g. less than 50%, preferably less than 30% of the
molecules will be the same.

The term "single molecule" is used herein to distinguish from high density
multi-molecule arrays in the prior art which may comprise distinct clusters of many
molecules of the same type.

The term "individually resolved" is used herein to indicate that, when visualised,
it is possible to distinguish one molecule on the array from its neighbouring molecules.
Visualisation may be effected by the use of reporter labels, e.g. fluorophores, the signal
of which is individually resolved.



The term "cognate molecule" is used herein to refer to any molecule capable of
interacting with, or interrogating, the arrayed molecule. The cognate may be a molecule that
binds specifically to the arrayed molecule, for example a complementary polynucleotide,
in a hybridisation reaction.

The term "interrogate" is used herein to refer to any interaction of the arrayed
molecule with any other molecule. The interaction may be covalent or non-covalent.

The terms "arrayed polynucleotides" and "polynucleotide arrays" are used herein
to define a plurality of single molecules that are characterised by comprising a
polynucleotide. The term is intended to include the attachment of other molecules to a
solid surface, the molecules having a polynucleotide attached that can be further
interrogated. For example, the arrays may comprise protein molecules immobilised on
a solid surface, the protein molecules being conjugated or otherwise bound to a short
polynucleotide molecule that may be interrogated, to address the array.

The density of the arrays is not critical. However, the present invention can make
use of a high density of single molecules, and these are preferable. For example, arrays
with a density of 10^6-10^9 molecules per cm^2 may be used. Preferably, the density is at
least 10^7/cm^2 and typically up to 10^8/cm^2. These high density arrays are in contrast to
other arrays which may be described in the art as "high density" but which are not
necessarily as high and/or which do not allow single molecule resolution.

Using the methods and apparatus of the present invention, it may be possible to
image at least 10^7 or 10^8 molecules simultaneously. Fast sequential imaging may be
achieved using a scanning apparatus; shifting and transfer between images may allow
higher numbers of molecules to be imaged.

The extent of separation between the individual molecules on the array will be
determined, in part, by the particular technique used to resolve the individual molecule.
Apparatus used to image molecular arrays are known to those skilled in the art. For
example, a confocal scanning microscope may be used to scan the surface of the array
with a laser to image directly a fluorophore incorporated on the individual molecule by
fluorescence. This may be achieved using the apparatus illustrated in Fig. 1. Fig. 1 shows
a detector 1, a bandpass filter 2, a pinhole 3, a mirror 4, a laser beam 5, a dichroic mirror
6, an objective 7, a glass coverslip 8 and a sample 9 under study. Alternatively, a
sensitive 2-D detector, such as a charge-coupled detector, can be used to provide a 2-D
image representing the individual molecules on the array.

Resolving single molecules on the array with a 2-D detector can be done if, at 100
x magnification, adjacent molecules are separated by a distance of approximately at least
250nm, preferably at least 300nm and more preferably at least 350nm. It will be
appreciated that these distances are dependent on magnification, and that other values
can be determined accordingly, by one of ordinary skill in the art.
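
To relate these separations to array density, the following Python sketch computes
the upper bound on density for a given minimum spacing. It assumes an idealised
square lattice with exactly one molecule per lattice point, which is an illustrative
simplification; a randomly deposited array would need a lower average density to
keep most neighbours this far apart.

```python
def max_density_per_cm2(separation_nm: float) -> float:
    """Upper bound on molecules per cm^2 for a square lattice with the
    given nearest-neighbour separation (in nanometres)."""
    separation_cm = separation_nm * 1e-7  # 1 nm = 1e-7 cm
    return 1.0 / separation_cm ** 2


for d in (250, 300, 350):
    print(d, f"{max_density_per_cm2(d):.2e}")
```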

Other techniques such as scanning near-field optical microscopy (SNOM) are
available which are capable of greater optical resolution, thereby permitting more dense
arrays to be used. For example, using SNOM, adjacent molecules may be separated by
a distance of less than 100nm, e.g. 10nm. For a description of scanning near-field optical
microscopy, see Moyer et al., Laser Focus World (1993) 29(10).

An additional technique that may be used is surface-specific total internal
reflection fluorescence microscopy (TIRFM); see, for example, Vale et al., Nature,
(1996) 380:451-453. Using this technique, it is possible to achieve wide-field imaging
(up to 100 µm x 100 µm) with single molecule sensitivity. This may allow arrays of
greater than 10^7 resolvable molecules per cm^2 to be used.

Additionally, the techniques of scanning tunnelling microscopy (Binnig et al.,
Helvetica Physica Acta (1982) 55:726-735) and atomic force microscopy (Hansma et al.,
Ann. Rev. Biophys. Biomol. Struct. (1994) 23:115-139) are suitable for imaging the
arrays of the present invention. Other devices which do not rely on microscopy may also
be used, provided that they are capable of imaging within discrete areas on a solid
support.

Single molecules may be arrayed by immobilisation to the surface of a solid
support. This may be carried out by any known technique, provided that suitable
conditions are used to ensure adequate separation of the molecules. Generally the array
is produced by dispensing small volumes of a sample containing a mixture of molecules
onto a suitably prepared solid surface, or by applying a dilute solution to the solid surface
to generate a random array. In this manner, a mixture of different molecules may be
arrayed by simple means. The formation of the single molecule array then permits
interrogation of each arrayed molecule to be carried out.

Suitable solid supports are available commercially, and will be apparent to the
skilled person. The supports may be manufactured from materials such as glass,
ceramics, silica and silicon. The supports usually comprise a flat (planar) surface, or at
least an array in which the molecules to be interrogated are in the same plane. Any
suitable size may be used. For example, the supports might be of the order of 1-10 cm
in each direction.

It is important to prepare the solid support under conditions which minimise or
avoid the presence of contaminants. The solid support must be cleaned thoroughly,
preferably with a suitable detergent, e.g. Decon-90, to remove dust and other
contaminants.

Immobilisation may be by specific covalent or non-covalent interactions.
Covalent attachment is preferred. If the molecule is a polynucleotide, immobilisation will
preferably be at either the 5' or 3' position, so that the polynucleotide is attached to the
solid support at one end only. However, the polynucleotide may be attached to the solid
support at any position along its length, the attachment acting to tether the
polynucleotide to the solid support. The immobilised polynucleotide is then able to
undergo interactions with other molecules or cognates at positions distant from the solid
support. Typically the interaction will be such that it is possible to remove any molecules
bound to the solid support through non-specific interactions, e.g. by washing.
Immobilisation in this manner results in well separated single molecules. The advantage
of this is that it prevents interaction between neighbouring molecules on the array, which
may hinder interrogation of the array.

In one embodiment of the invention, the surface of a solid support is first coated
with streptavidin or avidin, and then a dilute solution of a biotinylated molecule is added
at discrete sites on the surface using, for example, a nanolitre dispenser to deliver one
molecule on average to each site.
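
Delivering "one molecule on average" still leaves some sites empty and some with
more than one molecule. The Python sketch below estimates the split, assuming (the
text does not say this) that the number of molecules delivered to a site follows a
Poisson distribution with the stated mean.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k molecules at a site for a Poisson mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


mean = 1.0  # "one molecule on average to each site"
empty = poisson_pmf(0, mean)
single = poisson_pmf(1, mean)
multiple = 1.0 - empty - single
print(round(empty, 3), round(single, 3), round(multiple, 3))
```

Lowering the mean in this model reduces the share of multiply occupied sites at the
cost of more empty ones.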

In a preferred embodiment of the invention, the solid surface is coated with an
epoxide and the molecules are coupled via an amine linkage. It is also preferable to avoid
or reduce salt present in the solution containing the molecule to be arrayed. Reducing
the salt concentration minimises the possibility of the molecules aggregating in the
solution, which may affect the positioning on the array.



If the molecule is a polynucleotide, then immobilisation may be via hybridisation
to a complementary nucleic acid molecule previously attached to a solid support. For
example, the surface of a solid support may be first coated with a primer polynucleotide
at discrete sites on the surface. Single-stranded polynucleotides are then brought into
contact with the arrayed primers under hybridising conditions and allowed to "self-sort"
onto the array. In this way, the arrays may be used to separate the desired
polynucleotides from a heterogeneous sample of polynucleotides.

Alternatively, the arrayed primers may be composed of double-stranded
polynucleotides with a single-stranded overhang ("sticky-ends"). Hybridisation with
target polynucleotides is then allowed to occur and a DNA ligase used to covalently link
the target DNA to the primer. The second DNA strand can then be removed under
melting conditions to leave an arrayed polynucleotide.

In an embodiment of the invention, the target molecules are immobilised onto
non-fluorescent streptavidin or avidin-functionalised polystyrene latex microspheres, as
shown in Fig. 2. Fig. 2 shows a microsphere 11, a streptavidin molecule 12, a biotin
molecule 13 and a fluorescently labelled polynucleotide 14. The microspheres are
immobilised in turn onto a solid support to fix the target sample for microscope analysis.
Alternative microspheres suitable for use in the present invention are well known in the
art.

In one aspect of the present invention, the devices comprise arrayed
polynucleotides, each polynucleotide comprising a hairpin loop structure, one end of
which comprises a target polynucleotide, the other end comprising a relatively short
polynucleotide capable of acting as a primer in the polymerase reaction. This ensures
that the primer is able to perform its priming function during a polymerase-based
sequencing procedure, and is not removed during any washing step in the procedure.
The target polynucleotide is capable of being interrogated.

The term "hairpin loop structure" refers to a molecular stem and loop structure
formed from the hybridisation of complementary polynucleotides that are covalently
linked. The stem comprises the hybridised polynucleotides and the loop is the region that
covalently links the two complementary polynucleotides. Anything from a 10 to 20 (or
more) base pair double-stranded (duplex) region may be used to form the stem. In one
embodiment, the structure may be formed from a single-stranded polynucleotide having
complementary regions. The loop in this embodiment may be anything from 2 or more
non-hybridised nucleotides. In a second embodiment, the structure is formed from two
separate polynucleotides with complementary regions, the two polynucleotides being
linked (and the loop being at least partially formed) by a linker moiety. The linker moiety
forms a covalent attachment between the ends of the two polynucleotides. Linker
moieties suitable for use in this embodiment will be apparent to the skilled person. For
example, the linker moiety may be polyethylene glycol (PEG).
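
The first embodiment (a single strand with complementary regions) can be sketched in
Python as follows. The stem and loop sequences are hypothetical examples chosen only
to satisfy the lengths mentioned above (a 10 base pair stem and a loop of more than 2
nucleotides); the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def make_hairpin(stem: str, loop: str) -> str:
    """Single strand (5'->3') that can fold into a stem and loop."""
    return stem + loop + reverse_complement(stem)


def is_hairpin(strand: str, stem_len: int) -> bool:
    """True if the first and last `stem_len` bases can base-pair."""
    return strand[:stem_len] == reverse_complement(strand[-stem_len:])


hairpin = make_hairpin("GCGTACCAGT", "TTTT")  # hypothetical sequences
print(hairpin, is_hairpin(hairpin, 10))
```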

There are many different ways of forming the hairpin structure to incorporate the
target polynucleotide. However, a preferred method is to form a first molecule capable
of forming a hairpin structure, and ligate the target polynucleotide to this. Ligation may
be carried out either prior to or after immobilisation to the solid support. The resulting
structure comprises the single-stranded target polynucleotide at one end of the hairpin
and a primer polynucleotide at the other end.

In one embodiment, the target polynucleotide is genomic DNA purified using
conventional methods. The genomic DNA may be PCR-amplified or used directly to
generate fragments of DNA using either restriction endonucleases, other suitable
enzymes, a mechanical form of fragmentation or a non-enzymatic chemical fragmentation
method. In the case of fragments generated by restriction endonucleases, hairpin
structures bearing a complementary restriction site at the end of the first hairpin may be
used, and selective ligation of one strand of the DNA sample fragments may be achieved
by one of two methods.

Method 1 uses a first hairpin whose restriction site contains a phosphorylated 5'
end. Using this method, it may be necessary to first de-phosphorylate the
restriction-cleaved genomic or other DNA fragments prior to ligation such that only one
sample strand is covalently ligated to the hairpin.

Method 2: in the design of the hairpin, a single (or more) base gap can be
incorporated at the 3' end (the recessed strand) such that upon ligation of the DNA
fragments only one strand is covalently joined to the hairpin. The base gap can be formed
by hybridising a further separate polynucleotide to the 5'-end of the first hairpin structure.
On ligation, the DNA fragment has one strand joined to the 5'-end of the first hairpin, and
the other strand joined to the 3'-end of the further polynucleotide. The further
polynucleotide (and the other strand of the DNA fragment) may then be removed by
disrupting hybridisation.

In either case, the net result should be covalent ligation of only one strand of a
DNA fragment of genomic or other DNA, to the hairpin. Such ligation reactions may
be carried out in solution at optimised concentrations based on conventional ligation
chemistry, for example, carried out by DNA ligases or non-enzymatic chemical ligation.
Should the fragmented DNA be generated by random shearing of genomic DNA or
polymerase, then the ends can be filled in with Klenow fragment to generate blunt-ended
fragments which may be blunt-end-ligated onto blunt-ended hairpins. Alternatively, the
blunt-ended DNA fragments may be ligated to oligonucleotide adapters which are
designed to allow compatible ligation with the sticky-end hairpins, in the manner
described previously.

The hairpin-ligated DNA constructs may then be covalently attached to the
surface of a solid support to generate a single molecule array (SMA), or ligation may
follow attachment to form the array.

The arrays may then be used in procedures to determine the sequence of the
target polynucleotide. If the target fragments are generated via restriction digest of
genomic DNA, the recognition sequence of the restriction or other nuclease enzyme will
provide 4, 6, 8 bases or more of known sequence (dependent on the enzyme). Further
sequencing of between 10 and 20 bases on the SMA should provide sufficient overall
sequence information to place that stretch of DNA into unique context with a total
human genome sequence, thus enabling the sequence information to be used for
genotyping and more specifically single nucleotide polymorphism (SNP) scoring.

Simple calculations have suggested the following, based on sequencing a 10^7
molecule SMA prepared from hairpin ligation: for a 6 base pair recognition sequence, a
single restriction enzyme will generate approximately 10^6 ends of DNA. If a stretch of
13 bases is sequenced on the SMA (i.e. 13 x 10^6 bases), approximately 13,000 SNPs will
be detected. One application of such a sample preparation and sequencing format would
in general be for SNP discovery in pharmaco-genetic analysis. The approach is therefore
suitable for forensic analysis or any other system which requires unambiguous
identification of individuals to a level as low as 10^3 SNPs.

It is of course possible to sequence the complete target polynucleotide, if
required.
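
The simple calculation above can be reproduced with a short Python sketch. It
assumes, for illustration only, a genome of 3 x 10^9 bases of random sequence with
equal base frequencies, two DNA ends per cut site, and one SNP per 1000 bases; the
function names are not taken from the source.

```python
GENOME_BASES = 3 * 10**9  # assumed genome size
SNP_INTERVAL = 1000       # assumed: one SNP per 1000 bases


def restriction_ends(genome_bases: int, site_length: int) -> float:
    """Approximate number of DNA ends from one restriction enzyme,
    assuming random sequence and two ends per cut site."""
    cut_sites = genome_bases / 4 ** site_length
    return 2 * cut_sites


def snps_detected(ends: int, read_length: int) -> int:
    """SNPs expected when `read_length` bases are read at each end."""
    return ends * read_length // SNP_INTERVAL


print(f"{restriction_ends(GENOME_BASES, 6):.2e}")  # of the order of 10^6
print(snps_detected(10**6, 13))                    # rounded figure from the text
print(4 ** (6 + 13) > GENOME_BASES)                # 19 known bases vs genome size
```

The last line compares the number of possible 19-base sequences (a 6 base recognition
site plus 13 sequenced bases) with the genome size, which is the sense in which such a
stretch can be placed in a unique context under the random-sequence assumption.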

In a separate aspect of the invention, the devices may comprise immobilised
polynucleotides and other immobilised molecules. The other molecules are relatively
short compared to the polynucleotides and are intended to prevent non-specific
attachment of reagents, e.g. fluorophores, with the solid support, thereby reducing
background interference. In one embodiment, the other molecules are relatively short
polynucleotides. However, many different molecules may be used, e.g. peptides,
proteins, polymers and synthetic chemicals, as will be apparent to the skilled person.
Preparation of the devices may be carried out by first preparing a mixture of the relatively
long polynucleotides and of the relatively short molecules. Usually, the concentration of
the latter will be in excess of that of the long polynucleotides. The mixture is then placed
in contact with a suitably prepared solid support, to allow immobilisation to occur.

The single molecule arrays have many applications in methods which rely on the
detection of biological or chemical interactions with arrayed molecules. For example, the
arrays may be used to determine the properties or identities of cognate molecules.
Typically, interaction of biological or chemical molecules with the arrays is carried out
in solution.

In particular, the arrays may be used in conventional assays which rely on the
detection of fluorescent labels to obtain information on the arrayed molecules. The
arrays are particularly suitable for use in multi-step assays where the loss of
synchronisation in the steps was previously regarded as a limitation to the use of arrays.
When the arrays are composed of polynucleotides they may be used in conventional
techniques for obtaining genetic sequence information. Many of these techniques rely
on the stepwise identification of suitably labelled nucleotides, referred to in
US-A-5634413 as "single base" sequencing methods.

In an embodiment of the invention, the sequence of a target polynucleotide is
determined in a similar manner to that described in US-A-5634413, by detecting the
incorporation of nucleotides into the nascent strand through the detection of a fluorescent
label attached to the incorporated nucleotide. The target polynucleotide is primed with
a suitable primer (or prepared as a hairpin construct which will contain the primer as part
of the hairpin), and the nascent chain is extended in a stepwise manner by the polymerase
reaction. Each of the different nucleotides (A, T, G and C) incorporates a unique
fluorophore at the 3' position which acts as a blocking group to prevent uncontrolled
polymerisation. The polymerase enzyme incorporates a nucleotide into the nascent chain
complementary to the target, and the blocking group prevents further incorporation of
nucleotides. The array surface is then cleared of unincorporated nucleotides and each
incorporated nucleotide is "read" optically by a charge-coupled detector using laser
excitation and filters. The 3'-blocking group is then removed (deprotected), to expose
the nascent chain for further nucleotide incorporation.

Because the array consists of distinct optically resolvable polynucleotides, each
target polynucleotide will generate a series of distinct signals as the fluorescent events
are detected. Details of the full sequence are then determined.

The number of cycles that can be achieved is governed principally by the yield of
the deprotection cycle. If deprotection fails in one cycle, it is possible that later
deprotection and continued incorporation of nucleotides can be detected during the next
cycle. Because the sequencing is performed at the single molecule level, the sequencing
can be carried out on different polynucleotide sequences at one time without the
necessity for separation of the different sample fragments prior to sequencing. This
sequencing also avoids the phasing problems associated with prior art methods.
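
The read-out described above can be sketched in Python as follows. The assignment of
dye colours to bases and the example signal traces are hypothetical; the sketch only
illustrates that, because each molecule is read on its own, a cycle with no new signal
(for example after a failed deprotection) is simply skipped for that molecule.

```python
# Hypothetical dye-to-base assignment; the source only says each of
# A, T, G and C carries a unique fluorophore.
DYE_TO_BASE = {"red": "A", "green": "T", "blue": "G", "yellow": "C"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def call_nascent_strand(signals):
    """Bases incorporated on one molecule, in order (5'->3').

    `signals` holds one entry per cycle: a dye name, or None when no
    new incorporation was detected in that cycle.
    """
    return "".join(DYE_TO_BASE[s] for s in signals if s is not None)


def target_from_nascent(nascent):
    """Target (template) sequence 5'->3', i.e. the reverse complement."""
    return "".join(COMPLEMENT[b] for b in reversed(nascent))


# Two molecules imaged over the same five cycles (made-up traces).
molecule_a = ["red", None, "blue", "blue", "green"]
molecule_b = ["yellow", "yellow", "red", None, None]
for signals in (molecule_a, molecule_b):
    nascent = call_nascent_strand(signals)
    print(nascent, target_from_nascent(nascent))
```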

Deprotection may be carried out by chemical, photochemical or enzymatic
reactions.

A similar, and equally applicable, sequencing method is disclosed in
EP-A-0640146.

Other suitable sequencing procedures will be apparent to the skilled person. In
particular, the sequencing method may rely on the degradation of the arrayed
polynucleotides, the degradation products being characterised to determine the sequence.

An example of a suitable degradation technique is disclosed in WO-A-95/20053,
whereby bases on a polynucleotide are removed sequentially, a predetermined number 
at a time, through the use of labelled adaptors specific for the bases, and a defined 
exonuclease cleavage. 

A consequence of sequencing using non-destructive methods is that it is possible
to form a spatially addressable array for further characterisation studies, and therefore
non-destructive sequencing may be preferred. In this context, the term "spatially
addressable" is used herein to describe how different molecules may be identified on the
basis of their position on an array.

Once sequenced, the spatially addressed arrays may be used in a variety of 
procedures which require the characterisation of individual molecules from 
heterogeneous populations.

One application is to use the arrays to characterise products synthesised in
combinatorial chemistry reactions. During combinatorial synthesis reactions, it is usual
for a tag or label to be incorporated onto a beaded support or reaction product for the
subsequent characterisation of the product. This is adapted in the present invention by
using polynucleotide molecules as the tags, each polynucleotide being specific for a
particular product, and using the tags to hybridise onto a spatially addressed array.
Because the sequence of each arrayed polynucleotide has been determined previously,
the detection of a hybridisation event on the array reveals the sequence of the
complementary tag on the product. Having identified the tag, it is then possible to
confirm which product this relates to. The complete process is therefore quick and
simple, and the arrays may be reused for high throughput screening. Detection may be
carried out by attaching a suitable label to the product, e.g. a fluorophore.
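
The decoding step described above amounts to two look-ups, which the following Python sketch illustrates. The array coordinates, sequences and product names are hypothetical values chosen for illustration; only the logic (position, then arrayed sequence, then complementary tag, then product) follows the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def decode_hits(array_map, tag_to_product, hit_positions):
    """Map each position showing hybridisation to the product carrying the tag.

    array_map: position -> previously sequenced arrayed polynucleotide.
    tag_to_product: tag sequence -> product identifier.
    """
    results = {}
    for position in hit_positions:
        arrayed = array_map[position]
        tag = reverse_complement(arrayed)           # the tag that can hybridise here
        results[position] = tag_to_product.get(tag)  # None if the tag is unknown
    return results

# Hypothetical spatially addressed array and tag assignments.
array_map = {(0, 0): "ACGTTGCA", (0, 1): "GGATCCAA", (1, 0): "TTTTCCCC"}
tag_to_product = {
    reverse_complement("GGATCCAA"): "product-17",
    reverse_complement("TTTTCCCC"): "product-42",
}
hits = decode_hits(array_map, tag_to_product, [(0, 1), (1, 0)])
```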

Combinatorial chemistry reactions may be used to synthesize a diverse range of
different molecules, each of which may be identified using the addressed arrays of the
present invention. For example, combinatorial chemistry may be used to produce
therapeutic proteins or peptides that can be bound to the arrays to produce an addressed
array of target proteins. The targets may then be screened for activity, and those proteins
exhibiting activity may be identified by their position on the array as outlined above.

Similar principles apply to other products of combinatorial chemistry, for example
the synthesis of non-polymeric molecules of m.wt. <1000. Methods for generating
peptides/proteins by combinatorial methods are disclosed in US-A-5643768 and
US-A-5658754. Split-and-mix approaches may also be used, as described in Nielsen et al.,
J. Am. Chem. Soc. (1993) 115:9812-9813.

In an alternative approach, the products of the combinatorial chemistry reactions 
may comprise a second polynucleotide tag not involved in the hybridisation to the array. 
After formation by hybridisation, the array may be subjected to repeated polynucleotide
sequencing to identify the second tag which remains free. The sequencing may be carried
out as described previously.

Therefore, in this application, it is the tag that provides the spatial address on the
array. The tag may then be removed from the product by, for example, a cleavable
linker, to leave an untagged spatially addressed array.

A further application is to display proteins via an immobilised polysome 
containing trapped polynucleotides and protein in a complex, as described in 
US-A-5643768 and US-A-5658754. 

In a separate embodiment of the invention, the arrays may be used to characterise
an organism. For example, an organism's genomic DNA may be screened using the
arrays, to reveal discrete hybridisation patterns that are unique to an individual. This
embodiment may therefore be likened to a "bar code" for each organism. The organism's
genomic DNA may be first fragmented and detectably-labelled, for example with a
fluorophore. The fragmented DNA is then applied to the array under hybridising
conditions and any hybridisation events monitored.

Alternatively, hybridisation may be detected using an in-built fluorescence based 
detection system in the arrayed molecule, for example using the "molecular beacons" 
described in Nature Biotechnology (1996) 14:303-308.

It is possible to design the arrays so that the hybridisation pattern generated is 
unique to the organism and so could be used to provide valuable information on the 
genetic character of an individual. This may have many useful applications in forensic
science. Alternatively, the methods may be carried out for the detection of mutations or 
allelic variants within the genomic DNA of an organism. 

For genotyping, it is desirable to identify if a particular sequence is present in the
genome. The smallest possible unique oligomer is a 16-mer (assuming randomness of
the genome sequence), i.e. statistically there is a probability of any given 16-base
sequence occurring only once in the human genome (which has 3 x 10⁹ bases). There are
c. 4 x 10⁹ possible 16-mers which would fit within a region of 2 cm x 2 cm (assuming a
single copy at a density of 1 molecule per 250 nm x 250 nm square). It is therefore
necessary to determine only if a particular 16-mer is present or not, and so quantitative
measurements are unnecessary. Identifying a mutation in a particular region and what
the mutation is can be carried out using the 16-mer library. Mapping back onto the
human genome would be possible using published data and would not be a problem once
the entire genome has been determined. There is a built-in self-check, by looking at the
hybridisation to particular 16-mers, so that if there is a single point mutation, this will
show up in 16 different 16-mers, identifying a region of 32 bases in the genome (the
mutation would occur at the top of one 16-mer and then at the second base in a related
16-mer etc). Thus, a single point mutation would result in 16 of the 16-mers not
showing hybridisation and a new set of 16 showing hybridisation, plus the same thing for
the complementary strand. In summary, considering both strands of DNA, a single point
mutation would result in 32 of the 16-mers not showing hybridisation and 32 new
16-mers showing hybridisation, i.e. quite large changes on the hybridisation pattern to the
array.
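
The figures in the preceding paragraph can be checked with a short Python sketch. The helper names and the toy reference sequence are assumptions made for illustration. The code counts the 16-mer windows that overlap one mutated base on a single strand and compares the number of possible 16-mers with the number of 250 nm x 250 nm sites in a 2 cm x 2 cm region.

```python
def kmer_windows_covering(position, k, length):
    """Start indices of the k-mer windows that include `position` (0-based)."""
    first = max(0, position - k + 1)
    last = min(position, length - k)
    return list(range(first, last + 1))

def changed_kmers(reference, position, new_base, k=16):
    """Return the k-mers lost and gained on one strand after a point mutation."""
    mutant = reference[:position] + new_base + reference[position + 1:]
    starts = kmer_windows_covering(position, k, len(reference))
    lost = [reference[s:s + k] for s in starts]   # no longer present in the sample
    gained = [mutant[s:s + k] for s in starts]    # newly present in the sample
    return lost, gained

K = 16
possible_16mers = 4 ** K                  # 4,294,967,296, i.e. c. 4 x 10^9
sites_per_side = round(2e-2 / 250e-9)     # 2 cm divided into 250 nm steps
array_sites = sites_per_side ** 2         # sites available in 2 cm x 2 cm

# Toy reference (hypothetical); windows are counted by position, so repeated
# k-mers in a real genome could reduce the number of distinct k-mers affected.
reference = "ACGT" * 16
lost, gained = changed_kmers(reference, position=30, new_base="T")
```

For the complementary strand the same count applies, giving the 32 lost and 32 gained 16-mers stated above.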

By way of example, a sample of human genomic DNA may be restriction-digested
to generate short fragments, then labelled using a fluorescently-labelled monomer and a
DNA polymerase or a terminal transferase enzyme. This produces short lengths of
sample DNA with a fluorophore at one end. The melted fragments may then be exposed
to the array and the pixels where hybridisation occurs or not would be identified. This
produces a genetic bar code for the individual with (if oligonucleotides of length 16 were
used) c. 4 x 10⁹ binary coding elements. This would uniquely define a person's genotype
for pharmacogenomic applications. Since the arrays should be reusable, the same process
could be repeated on a different individual.

In one embodiment of the invention, a method for determining a single nucleotide
polymorphism (SNP) present in a genome comprises immobilising fragments of the
genome onto the surface of a solid support to form an array as defined above, identifying
nucleotides at selected positions in the genome, and comparing the results with a known
consensus sequence to identify any differences between the consensus sequence and the
genome. Identifying the nucleotides at selected positions in the genome may be carried
out by contacting the array sequentially with each of the bases A, T, G and C, under
conditions that permit the polymerase reaction to proceed, and monitoring the
incorporation of a base at selected positions in the complementary sequence.

The fragments of the genome may be unamplified DNA obtained from several
cells from an individual, which is treated with a restriction enzyme. As indicated above,
it is not necessary to determine the sequence of the full fragment. For example, it may
be preferable to determine the sequence of 16-30 specific bases, which is sufficient to
identify the DNA fragment by comparison to a consensus sequence, e.g. to that known
from the Human Genome Project. Any SNP occurring within the sequenced region can
then be identified. The specific bases do not have to be contiguous. For example, the
procedure may be carried out by the incorporation of non-labelled bases followed, at
pre-determined positions, by the incorporation of a labelled base. Provided that the sequence
of sufficient bases is determined, it should be possible to identify the fragment. Again,
any SNPs occurring at the determined base positions can be identified. For example, the
method may be used to identify SNPs that occur after cytosine. Template DNA
(genomic fragments) can be contacted with each of the bases A, T and G, added
sequentially or together, so that the complementary strand is extended up to a position
that requires C. Non-incorporated bases can then be removed from the array, followed
by the addition of C. The addition of C is followed by monitoring the next base
incorporation (using a labelled base). By repeating this process a sufficient number of
times, a partial sequence is generated where each base immediately following a C is
known. It will then be possible to identify the full sequence, by comparison of the partial
sequence to a reference sequence. It will then also be possible to determine whether
there are any SNPs occurring after any C.
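
The "base after C" procedure can be pictured with the following Python sketch. It is a simplified model under stated assumptions: the strand passed in is the complementary strand being synthesised, a run of consecutive C's is assumed to be filled in during a single C addition, and a SNP is assumed not to create or remove a C (which would shift the read positions). The function names and test sequences are hypothetical.

```python
def bases_after_c(synth_strand):
    """Return (index, base) for each labelled read taken directly after C.

    synth_strand: the complementary strand as it is synthesised, 5' to 3'.
    """
    reads = []
    i = 0
    n = len(synth_strand)
    while i < n:
        # Step 1: unlabelled A, T and G extend the strand until a C is required.
        while i < n and synth_strand[i] != "C":
            i += 1
        # Step 2: unlabelled C is added; a run of C's is assumed to fill in at once.
        while i < n and synth_strand[i] == "C":
            i += 1
        # Step 3: the next incorporation uses a labelled base and is recorded.
        if i < n:
            reads.append((i, synth_strand[i]))
            i += 1
    return reads

def snps_after_c(reference_strand, observed_reads):
    """List (index, reference base, observed base) where a read differs."""
    return [(i, reference_strand[i], base)
            for i, base in observed_reads
            if reference_strand[i] != base]

reference = "ATCGGACCTA"
sample = "ATCAGACCTA"          # hypothetical sample with G -> A after the first C
partial = bases_after_c(sample)
differences = snps_after_c(reference, partial)
```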

To further illustrate this, a device may comprise 10⁷ restriction fragments per cm².
If 30 bases are determined for each fragment, this means 3 x 10⁸ bases are identified.
Statistically, this should determine 3 x 10⁵ SNPs for the experiment. If the fragments
each comprise 1000 nucleotides, it is possible to have 10¹⁰ nucleotides per cm², or three
copies of the human genome. The approach therefore permits large sequence or SNP
analysis to be performed.
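
As a check on the arithmetic in this illustration, the quantities can be recomputed directly. The variable names are introduced here for convenience; the genome size of 3 x 10⁹ bases is the figure used earlier in the text.

```python
fragments_per_cm2 = 10**7          # restriction fragments arrayed per cm^2
bases_read_per_fragment = 30       # bases determined on each fragment
fragment_length = 1000             # nucleotides per fragment
human_genome_bases = 3 * 10**9     # genome size used in the text

bases_identified = fragments_per_cm2 * bases_read_per_fragment   # 3 x 10^8
nucleotides_per_cm2 = fragments_per_cm2 * fragment_length        # 10^10
genome_copies = nucleotides_per_cm2 / human_genome_bases         # roughly three
```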

Viral and bacterial organisms may also be studied, and screening nucleic acid
samples may reveal pathogens present in a disease, or identify microorganisms in
analytical techniques. For example, pathogenic or other bacteria may be identified using
a series of single molecule DNA chips produced from different strains of bacteria. Again,
these chips are simple to make and reusable.

In a further example, double-stranded arrays may be used to screen protein
libraries for binding, using fluorescently labelled proteins. This may determine proteins
that bind to a particular DNA sequence, i.e. proteins that control transcription. Once the
short sequence that the protein binds to has been determined, it may be made and affinity
purification used to isolate and identify the protein. Such a method could find all the
transcription-controlling proteins. One such method is disclosed in Nature
Biotechnology (1999) 17:573-577.

Another use is in expression monitoring. For this, a label is required for each
gene. There are approximately 100,000 genes in the human genome. There are 262,144
possible 9-mers, so this is the minimum length of oligomer needed to have a unique tag
for each gene. This 9-mer label needs to be at a specific point in the DNA and the best
point is probably immediately after the poly-A tail in the mRNA (i.e. a 9-mer linked to
a poly-T guide sequence). Multiple copies of these 9-mers should be present, to permit
quantitation of gene expression. 100 copies would allow determination of relative
expression from 1-100%. 10,000 copies would allow determination of relative gene
expression from 0.01-100%. 10,000 copies of 262,144 9-mers would fit inside 1 cm x 1
cm at close to maximum density.
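
The numbers in the expression monitoring example follow from simple counting, as the Python sketch below shows. The variable and function names are assumptions introduced for illustration; the mean spacing is derived by spreading all copies evenly over 1 cm x 1 cm.

```python
import math

tags = 4 ** 9                      # number of distinct 9-mers
genes = 100_000                    # approximate gene count quoted in the text
copies_per_tag = 10_000
molecules = tags * copies_per_tag  # molecules to fit inside 1 cm x 1 cm

# Mean centre-to-centre spacing if the molecules are spread evenly (1 cm = 1e7 nm).
pitch_nm = 1e7 / math.sqrt(molecules)

def lowest_relative_expression_percent(copies):
    """Smallest step in relative expression resolvable with `copies` per tag."""
    return 100.0 / copies
```

The resulting spacing of just under 200 nm is of the same order as the 250 nm x 250 nm site size used for the 16-mer example, consistent with "close to maximum density".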

The use of nanovials in conjunction with any of the above methods may allow a
molecule to be cleaved from the surface, yet retain its spatial integrity. This permits the
generation of spatially addressable arrays of single molecules in free solution, which may
have advantages where the surface attachment impedes the analysis (e.g. drug screening).
A nanovial is a small cavity in a flat glass surface, e.g. approx. 20 µm in diameter and 10
µm deep. They can be placed every 50 µm, and so the array would be less dense than
a surface-attached array; however, this could be compensated for by appropriate
adjustment in the imaging optics.

The following Examples illustrate the invention, with reference to the 
accompanying drawings. 

Example 1

The microscope set-up used in the following Example was based on a modified
confocal fluorescence system using a photon detector as shown in Figure 1. Briefly, a
narrow, spatially filtered laser beam (CW Argon Ion Laser Technology RPC50) was
passed through an acousto-optic modulator (AOM) (AA Opto-Electronic) which acts
as a fast optical switch. The acousto-optic modulator was switched on and the laser
beam was directed through an oil immersion objective (100X, NA = 1.3) of an inverted
optical microscope (Nikon Diaphot 200) by a dichroic beam splitter (540DRLP02 or
505DRLP02, Omega Optics Inc.). The objective focuses the light to a diffraction-limited
spot on the target sample immobilised on a thin glass coverslip. Fluorescence from the
sample was collected by the same objective, passed through the dichroic beam splitter
and directed through a 50 µm pinhole (Newport Corp.) placed in the image plane of the
microscope observation port. The pinhole rejects light emerging from the sample which
is out of the plane of the laser focus. The transmitted fluorescence was separated
spectrally by a dichroic beam splitter into red and green components which were filtered
to remove residual laser scatter. The remaining fluorescence components were then
focused onto separate single photon avalanche diode detectors and the signals recorded
onto a multichannel scaler (MCS) (MCS-Plus, EG & G Ortec) with time resolutions in
the 1 to 10 ms range.

The target sample was a 5'-biotin-modified 13-mer primer oligonucleotide
prepared using conventional phosphoramidite chemistry, and having SEQ ID No. 1 (see
listing, below). The oligonucleotide was post-synthetically modified by reaction of the
uridine base with the succinimidyl ester of tetramethylrhodamine (TMR).

Glass coverslips were prepared by cleaning with acetone and drying under
nitrogen. A 50 µl aliquot of biotin-BSA (Sigma) redissolved in PBS buffer (0.01 M, pH
7.4) at 1 mg/ml concentration was deposited on the clean coverslip and incubated for 8
hours at 30°C. Excess biotin-BSA was removed by washing 5 times with MilliQ water
and drying under nitrogen. Non-fluorescent streptavidin functionalised polystyrene latex
microspheres of diameter 500 nm (Polysciences Inc.) were diluted in 100 mM NaCl to 0.1
solids and deposited as a 1 µl drop on the biotinylated coverslip surface. The spheres
were allowed to dry for one hour and unbound beads removed by washing 5 times with
MilliQ water. This procedure resulted in a surface coverage of approximately 1
sphere/100 µm x 100 µm.

The non-fluorescent microspheres were found to have a broad residual
fluorescence at excitation wavelength 514 nm, probably arising from small quantities of
photoactive constituents used in the colloidal preparation of the microspheres. The
microspheres were therefore photobleached by treating the prepared coverslip in a laser
beam of a frequency doubled (532 nm) Nd:YAG pulsed dye laser, for 1 hour.

The biotinylated 13-TMR ssDNA was coupled to the streptavidin functionalised
microspheres by incubating a 50 µl sample of 0.1 pM DNA (diluted in 100 mM NaCl,
100 mM Tris) deposited over the microspheres. Unbound DNA was removed by
washing the coverslip surface 5 times with MilliQ water.

Low light level illumination from the microscope condenser was used to position
visually a microsphere at 10x magnification so that when the laser was switched on the
sphere was located in the centre of the diffraction-limited focus. The condenser was then
turned off and the light path switched to the fluorescence detection port. The MCS was
initiated and the fluorescence emitted from the latex sphere recorded on one or both
channels. The sample was excited at 514 nm and detection was made on the 600 nm
channel.

Figure 3 shows clearly that the fluorescence is switched on as the laser is
deflected into the microscope by the AOM, 0.5 seconds after the start of a scan. The
intensity of the fluorescence remains relatively constant for a short period of time (100
ms-3 s) and disappears in a single step process. The results show that single molecule
detection is occurring. This single step photobleaching is unambiguous evidence that the
fluorescence is from a single molecule.
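
The single-step criterion can be expressed as a simple test on a recorded intensity trace. The Python sketch below is illustrative only: it assumes that the background level and the intensity of one fluorophore (`unit_intensity`) are already known, and it ignores noise near the half-way level, which a practical analysis would need to filter. The function names and example traces are hypothetical.

```python
def intensity_levels(trace, background, unit_intensity):
    """Quantise each sample to a whole number of emitting fluorophores."""
    return [max(0, round((x - background) / unit_intensity)) for x in trace]

def count_bleach_steps(trace, background, unit_intensity):
    """Count the fluorophores lost in downward steps along the trace."""
    levels = intensity_levels(trace, background, unit_intensity)
    steps = 0
    for before, after in zip(levels, levels[1:]):
        if after < before:
            steps += before - after
    return steps

def is_single_molecule(trace, background, unit_intensity):
    """True if the trace never exceeds one fluorophore and bleaches in one step."""
    levels = intensity_levels(trace, background, unit_intensity)
    return max(levels) == 1 and count_bleach_steps(trace, background, unit_intensity) == 1

# Hypothetical traces in counts per time bin; the laser switches on after two bins.
single = [5, 6, 105, 98, 102, 4, 5]
double = [5, 205, 198, 104, 101, 6]
```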
Example 2 

This Example illustrates the preparation of single molecule arrays by direct
covalent attachment to glass followed by a demonstration of hybridisation to the array.
Covalently modified slides were prepared as follows. Spectrosil-2000 slides
(TSL, UK) were rinsed in milli-Q to remove any dust and placed wet in a bottle
containing neat Decon-90 and left for 12 h at room temperature. The slides were rinsed
with milli-Q and placed in a bottle containing a solution of 1.5%
glycidoxypropyltrimethoxy-silane in milli-Q and magnetically stirred for 4 h at room
temperature, rinsed with milli-Q and dried under N₂ to liberate an epoxide coated surface.

The DNA used was that shown in SEQ ID No. 2 (see sequence listing below),
where n represents a 5-methyl cytosine (Cy5) with a TMR group coupled via a linker to
the N4 position.

A sample of this (5 µl, 450 pM) was applied as a solution in neat milli-Q.
The DNA reaction was left for 12 h at room temperature in a humid atmosphere
to couple to the epoxide surface. The slide was then rinsed with milli-Q and dried under
N₂.

The prepared slides can be stored wrapped in foil in a desiccator for at least a
week without any noticeable contamination or loss of bound material. Control DNA of
the same sequence and fluorophore but without the 5'-amino group shows little stable
coverage when applied at the same concentration.

The TMR labelled slides were then treated with a solution of complementary
DNA (SEQ ID No. 3) (5 pM, 10 µl) in 100 mM PBS. The complementary DNA has the
sequence shown in SEQ ID No. 3, where n represents a methylcytosine group.

After 1 hour at room temperature the slides were cooled to 4°C and left for 24
hours. Finally, the slides were washed in PBS (100 mM, 1 mL) and dried under N₂.

A chamber was constructed on the slide by sealing a coverslip (No. 0, 22 x 22 mm,
Chance Propper Ltd, UK) over the sample area on two sides only with prehardened
microscope mounting medium (Eukitt, O. Kindler GmbH & Co., Freiburg, Germany)
whilst maintaining a gap of less than 200 µm between slide and coverslip. The chamber
was flushed 3x with 100 µl PBS (100 mM NaCl) and allowed to stabilise for 5 minutes
before analysing on a fluorescence microscope.

The slide was inverted so that the chamber coverslip contacted the objective lens
of an inverted microscope (Nikon TE200) via an immersion oil interface. A 60° fused
silica dispersion prism was optically coupled to the back of the slide through a thin film
of glycerol. Laser light was directed at the prism such that at the glass/sample interface
it subtends an angle of approximately 68° to the normal of the slide and subsequently
undergoes Total Internal Reflection (TIR). The critical angle for a glass/water interface
is 66°.
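
The quoted critical angle, and the evanescent field depth of about 150 nm given for the same geometry in Example 3, can be reproduced approximately from Snell's law. The sketch below uses the standard textbook expressions; the refractive indices (1.46 for fused silica, 1.33 for water) are assumed typical values that are not stated in this document.

```python
import math

def critical_angle_deg(n_dense, n_rare):
    """Critical angle for total internal reflection, from Snell's law."""
    return math.degrees(math.asin(n_rare / n_dense))

def penetration_depth_nm(wavelength_nm, incidence_deg, n_dense, n_rare):
    """1/e intensity depth of the evanescent field beyond the interface."""
    s = (n_dense * math.sin(math.radians(incidence_deg))) ** 2 - n_rare ** 2
    if s <= 0:
        raise ValueError("incidence angle is below the critical angle")
    return wavelength_nm / (4 * math.pi * math.sqrt(s))

N_SILICA = 1.46   # assumed refractive index of fused silica
N_WATER = 1.33    # assumed refractive index of the aqueous sample

theta_c = critical_angle_deg(N_SILICA, N_WATER)            # close to 66 degrees
depth = penetration_depth_nm(532, 68, N_SILICA, N_WATER)   # of the order of 150 nm
```

Because the 68° incidence angle is only slightly above the critical angle, the computed depth is sensitive to the assumed indices, so agreement to within a few tens of nanometres is all that should be expected.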

Fluorescence from single molecules of DNA-TMR or DNA-Cy5 produced by
excitation with the surface specific evanescent wave following TIR is collected by the
objective lens of the microscope and imaged onto an Intensified Charge Coupled Device
(ICCD) camera (Pentamax, Princeton Instruments, NJ). Two images were recorded
using a combination of 1) 532 nm excitation (frequency doubled solid state Nd:YAG,
Antares, Coherent) with a 580 nm fluorescence (580DF30, Omega Optics, USA) filter
for TMR and 2) 630 nm excitation (Nd:YAG pumped dye laser, Coherent 700) with a
670 nm filter (670DF40, Omega Optics, USA) for Cy5. Images were recorded with an
exposure time of 500 ms at the maximum gain of 10 on the ICCD. Laser powers incident
at the prism were 50 mW and 40 mW at 532 nm and 630 nm respectively. A third image
was taken with 532 nm excitation and detection at 670 nm to determine the level of
cross-talk from TMR on the Cy5 channel.

Single molecules were identified by single points of fluorescence with average
intensities greater than 3x that of the background. Fluorescence from a single molecule
is confined to a few pixels, typically a 3x3 matrix at 100x magnification, and has a narrow
Gaussian-like intensity profile. Single molecule fluorescence is also characterised by a
one-step photobleaching process in the time course of the intensity and was used to
distinguish single molecules from pixel regions containing two or more molecules, which
exhibited multi-step processes. Figures 4a and 4b show 60 µm x 60 µm fluorescence
images from covalently modified slides with DNA-TMR starting concentrations of 45 pM
and 450 pM. Figure 4c shows a control slide which was treated as above but with
DNA-TMR lacking the 5' amino modification.

To count molecules, a threshold for fluorescence intensities is first set to exclude
background noise. For a control sample, the background is essentially the thermal noise
of the ICCD measured to be 76 counts with a standard deviation of only 6 counts. A
threshold is arbitrarily chosen as a linear combination of the background, the average
counts over an image and the standard deviation over an image. In general, the latter two
quantities provide a measure of the number of pixels and range of intensities above
background. This method gives rise to threshold levels which are at least 12 standard
deviations above the background with a probability of less than 1 in 144 pixels
contributing from noise. By defining a single molecule fluorescent point as being at least
a 2x2 matrix of pixels and no larger than a 7x7, the probability of a single background
pixel contributing to the counting is eliminated and clusters are ignored.
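
A counting scheme of this kind might be sketched as follows. This is not the analysis software used for the Example: the weights in `intensity_threshold` are placeholders (the actual coefficients of the linear combination are not given), the use of 4-connected regions and of the bounding box as the "matrix" size are assumptions, and the synthetic image is invented for illustration.

```python
from statistics import mean, pstdev

def intensity_threshold(image, background, w_bg=1.0, w_mean=1.0, w_sd=1.0):
    """Linear combination of background, image mean and image standard deviation.

    The weights are hypothetical placeholders.
    """
    pixels = [p for row in image for p in row]
    return w_bg * background + w_mean * mean(pixels) + w_sd * pstdev(pixels)

def count_single_molecule_spots(image, threshold, min_side=2, max_side=7):
    """Count above-threshold regions whose bounding box is 2x2 to 7x7 pixels."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Flood-fill one 4-connected region and track its bounding box.
            stack = [(r, c)]
            seen[r][c] = True
            r_min = r_max = r
            c_min = c_max = c
            while stack:
                y, x = stack.pop()
                r_min, r_max = min(r_min, y), max(r_max, y)
                c_min, c_max = min(c_min, x), max(c_max, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            height = r_max - r_min + 1
            width = c_max - c_min + 1
            if min_side <= height <= max_side and min_side <= width <= max_side:
                count += 1   # isolated hot pixels and large clusters are skipped
    return count

# Synthetic 12x12 image: background of 76 counts, one 3x3 spot, one hot pixel.
image = [[76] * 12 for _ in range(12)]
for y in range(4, 7):
    for x in range(4, 7):
        image[y][x] = 500
image[10][1] = 500
spots = count_single_molecule_spots(image, threshold=150)
```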

In this manner, the surface density of single molecules of DNA-TMR is measured
at 2.9 x 10⁶ molecules/cm² (238 molecules in Figure 4a) and 5.8 x 10⁶ molecules/cm² (469
molecules in Figure 4b) at 45 pM and 450 pM DNA-TMR coupling concentrations. The
density is clearly not directly proportional to DNA concentration but will be some
function of the concentration, the volume of sample applied, the area covered by the
sample and the incubation time. The percentage of non-specifically bound DNA-TMR
and impurities contribute of the order of 3-9% per image (8 non-specifically bound
molecules in Figure 4c). Analysis of the photobleaching profiles shows only 6% of
fluorescence points contain more than 1 molecule.




Hybridisation was identified by the co-localisation of discrete points of
fluorescence from single molecules of TMR and Cy-5 following the superposition of two
images. Figures 5a and 5b show images of surface bound 20-mer labelled with TMR and
the complementary 20-mer labelled with Cy-5 deposited from solution. Figure 5d shows
those fluorescent points that are co-localised on the two former images. The degree of
hybridisation was estimated to be 7% of the surface-bound DNA (10 co-localised points
in 141 points from Figures 5d and 5a, respectively). The percentage of hybridised DNA
is estimated to be 37% of all surface-adsorbed DNA-Cy5 (10 co-localised points in 27
points from Figures 5d and 5b, respectively). Single molecules were counted by
matching size and intensity of fluorescent points to threshold criteria which separate
single molecules from background noise and cosmic rays. Figure 5c shows the level of
cross-talk from TMR on the Cy5 channel which is 2% as determined by counting only
those fluorescent points which fall within the criteria for determining the TMR single
molecule fluorescence (2 fluorescence points in 141 points from Figures 5c and 5a,
respectively).

This Example demonstrates that single molecule arrays can be formed, and
hybridisation events detected according to the invention. It is expected that the skilled
person will realise that modifications may be made to improve the efficiency of the
process. For example, improved washing steps, e.g. using a flow cell, would reduce
background noise and permit more concentrated solutions to be used, and hybridisation
protocols could be adapted by varying the parameters of temperature, buffer, time etc.
Example 3 

This experiment demonstrates the possibility of performing enzymatic
incorporation on a single molecule array. In summary, primer DNA was attached to the
surface of a solid support, and template DNA hybridised thereto. Two cycles of
incorporation of fluorophore-labelled nucleotides were then completed. This was
compared against a reference experiment where the immobilised DNA was pre-labelled
with the same two fluorophores prior to attachment to the surface, and control
experiments performed under adverse conditions for nucleotide incorporation.

The primer DNA sequence and the template DNA sequence used in this
experiment are shown in SEQ ID NOS. 4 and 5, respectively.
The buffer used contained 4 mM MgCl₂, 2 mM DTT, 50 mM Tris HCl (pH 7.6),
10 mM NaCl and 1 mM K₂PO₄ (100 µl).
Preparation of Slides 

Silica slides were treated with Decon for at least 24 hours and rinsed in water and
EtOH directly before use. The dried slides were placed in a 50 ml solution of 2%
glycidoxypropyltrimethoxysilane in EtOH/H₂SO₄ (2 drops/500 ml) at room temperature
for 2 hours. The slides were then rinsed in EtOH from a spray bottle and dried under N₂.
The DNA samples (SEQ ID NO. 4) were applied either as a 40-100 pM solution
(5 µl) in 10 mM K₂PO₄ pH 7.6 (allowed to dry overnight), or at least 1 µM concentration
over a sealed slide. The slides allowed to dry overnight were left over a layer of water
for 18 hours at room temperature and then rinsed with milli-Q (approx. 30 ml from a
spray bottle) and dried under N₂. The sealed slides were simply flushed with 50 ml buffer
prior to use. Control slides with no coupled DNA were simply left under the buffer for
identical time periods.

Enzyme Extensions on a Surface

For the first incorporation cycle, samples were prepared with the buffer
containing BSA (to 0.2 mg/ml), the triphosphate (Cy3dUTP; to 20 µM) and the
polymerase enzyme (T4 exo-; to 500 nM). In certain experiments, the template DNA
was also added at 2 µM. The mixture was flowed into cells which were incubated at
37°C for 2 hours and flushed with 500 ml buffer. The second incorporation cycle with
Cy5dCTP (20 µM), dATP (100 µM) and dGTP (100 µM) was performed in the same
way. The cells were flushed with 50 ml buffer and left for 12 hours prior to imaging.
Control reactions were performed as above with: a) no DNA coupled prior to extension;
b) DNA attached but no polymerase in the extension buffer; and c) DNA attached, but
the polymerase denatured by boiling.
Reference Sample 

A reference sample, not immobilised to the surface, was prepared in the following
way.

Buffer containing 1 µM of the sample DNA, BSA (0.2 mg/ml), TMR-labelled
dUTP (20 µM) and the polymerase enzyme (T4 exo-, 500 nM; 100 µl) was prepared.
The reaction was analysed and purified by reverse phase HPLC (5-30%
acetonitrile in ammonium acetate over 30 min) with UV and fluorescence detection. In
all cases, the labelled DNA was clearly separate from both the unlabelled DNA and the
labelled dNTPs. The material was concentrated and dissolved in 10 mM K₂PO₄ for
analysis by A260 and fluorescence. The material purified by HPLC was further extended
with labelled dCTP (20 µM), dATP (100 µM) and dGTP (100 µM) and HPLC purified
again. Surface coupling was then performed dry, at 100 pM concentrations.
Microscopic Analysis 

Following the single molecule DNA attachment procedure and extension
reactions, the sample cells were analysed on a single molecule total internal reflection
fluorescence microscope (TIRFM) in the following manner. A 60° fused silica dispersion
prism was coupled optically to the slide through an aperture in the cell via a thin film of
glycerol. Laser light was directed at the prism such that at the glass/sample interface it
subtends an angle of approximately 68° to the normal of the slide and subsequently
undergoes total internal reflection. The critical angle for a glass/water interface is 66°.
An evanescent field is generated at the interface which penetrates only ~150 nm into the
aqueous phase. Fluorescence from single molecules excited within this evanescent field
is collected by a 100X objective lens of an inverted microscope, filtered spectrally from
the laser light and imaged onto an Intensified Charge Coupled Device (ICCD) camera.
Two 90 µm x 90 µm images were recorded using a combination of: 1) 532 nm
excitation (frequency doubled Nd:YAG) with a 580 nm interference filter for Cy3
detection, and 2) 630 nm excitation (Nd:YAG pumped DCM dye laser) with a 670 nm
filter for Cy5 detection. Images were recorded with an exposure time of 500 ms at the
maximum ICCD gain of 5.75 counts/photoelectron. Laser powers incident at the prism
were 30 mW and 30 mW at 532 nm and 630 nm respectively. Two colour fluorophore
labelled nucleotide incorporations are identified by the co-localisation of discrete points
of fluorescence from single molecules of Cy3 and Cy5 following superimposing the two
images. Molecules are considered co-localised when fluorescent points are within a pixel
separation of each other. For a 90 µm x 90 µm field, projected onto a CCD array of
512 x 512 pixels, the pixel size dimension is 0.176 µm.
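
The co-localisation criterion and the pixel size quoted above can be illustrated with a short Python sketch. The matching rule used here (each point paired with at most one partner lying within one pixel in both x and y) is an assumed reading of "within a pixel separation", and the coordinates are hypothetical.

```python
def pixel_size_um(field_um, pixels):
    """Width of one pixel when a square field is projected onto the CCD."""
    return field_um / pixels

def colocalised_pairs(points_a, points_b, max_separation_px=1):
    """Pair points from two images that lie within max_separation_px of each other.

    Each point in points_b is used at most once.
    """
    used = set()
    pairs = []
    for ax, ay in points_a:
        for j, (bx, by) in enumerate(points_b):
            if j in used:
                continue
            if abs(ax - bx) <= max_separation_px and abs(ay - by) <= max_separation_px:
                pairs.append(((ax, ay), (bx, by)))
                used.add(j)
                break
    return pairs

size = pixel_size_um(90, 512)   # 0.17578125 um, i.e. 0.176 um

# Hypothetical spot centres (in pixels) from the Cy3 and Cy5 images.
cy3_points = [(10, 10), (100, 200), (300, 40)]
cy5_points = [(11, 10), (250, 250), (300, 41)]
pairs = colocalised_pairs(cy3_points, cy5_points)
fraction_colocalised = len(pairs) / len(cy3_points)
```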
Results 

The results of the experiment are shown in Table 1. The values shown are an
average of the number of molecules imaged (Cy3 and Cy5) over all frames (100 in each)
compiled in each experiment and the number of those molecules which are co-localised.
The final column represents the number of co-localised molecules expected if the two
fluorophores were randomly dispersed across the sample slide (N ≈ nΔr, where n is the
surface density of molecules and Δr = 0.176 µm is the minimum measurable separation).
The number in brackets indicates the magnitude by which the level of co-localisation in each
experiment is greater than random.

The percentage of co-localisation observed on this sample represents the
maximum measurable for a dual labelled system, i.e. there is a detection ceiling due to
photophysical effects which means the level is not 100%. These effects may arise from
interactions of the fluorophores with the DNA or the surface or both.

There is a statistically higher level of co-localisation in the incorporation
experiments compared to the controls (8% versus 2% respectively). This shows that it
is possible to perform enzymatic incorporation on the SMA and the level of incorporation
is close to that of the reference sequence. Improvements in the surface attachment and
the nature of the surface are required to increase the level of co-localisation in the
reference and to increase the detection efficiency of the enzymatic incorporation.

Example 4 

This Example illustrates the preparation of single molecule arrays by direct 
covalent attachment of hairpin loop structures to glass.

A solution of 1% glycidoxypropyltrimethoxy-silane in 95% ethanol/5% water
with 2 drops H2SO4 per 500 ml was stirred for 5 minutes at room temperature. Clean,
dry Spectrosil-2000 slides (TSL, UK) were placed in the solution and the stirring
stopped. After 1 hour the slides were removed, rinsed with ethanol, dried under N2 and
oven-cured for 30 min at 100°C. These 'epoxide' modified slides were then treated with
1 of labelled DNA (5'-Cy3-CTGCTGAAGCGTCGGCAGGT-heg-aminodT-heg-
ACCTGCCGACGCT-3') (SEQ ID NOS. 6 and 7) in 50 mM potassium phosphate buffer,
pH 7.4 for 18 hours at room temperature and, prior to analysis, flushed with 50 mM
potassium phosphate, 1 mM EDTA, pH 7.4. The coupling reactions were performed in
sealed teflon blocks under a pre-mounted coverslip to prevent evaporation of the sample
and allow direct imaging.

The DNA structure was designed as a self-priming template system with an
internal amino group attached as an amino deoxy-thymidine held by two 18 atom
hexaethylene glycol (heg) spacers, and was synthesised by conventional DNA synthesis
techniques using phosphoramidite monomers.
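
The self-priming design can be checked with a short sketch that folds the 3' arm back onto the 5' arm and reports the resulting stem and single-stranded template. The two arm sequences are those given above; the function names are hypothetical and the check considers Watson-Crick pairing only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def stem_length(five_prime_arm, three_prime_arm):
    """Number of contiguous base pairs formed, counted from the loop outward,
    when the 3' arm folds back onto the 3' end of the 5' arm."""
    rc = reverse_complement(three_prime_arm)
    n = 0
    for a, b in zip(reversed(five_prime_arm), reversed(rc)):
        if a != b:
            break
        n += 1
    return n

# Arms on either side of the heg-aminodT-heg linker, as given in the text
five_prime_arm = "CTGCTGAAGCGTCGGCAGGT"
three_prime_arm = "ACCTGCCGACGCT"

stem = stem_length(five_prime_arm, three_prime_arm)

# Unpaired 5' region that serves as template for extension of the 3' end
template_overhang = five_prime_arm[:len(five_prime_arm) - stem]

# First nucleotide a polymerase would add: complement of the template base
# immediately 5' of the stem
first_incorporated = template_overhang[-1].translate(COMPLEMENT)
```

With these sequences the whole 13-base 3' arm pairs with the 5' arm, leaving a 7-base 5' overhang (CTGCTGA) as template.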

For imaging, one slide was inverted so that the chamber coverslip contacted the
objective lens of an inverted microscope (Nikon TE200) via an immersion oil interface.
A 60° fused silica dispersion prism was coupled optically to the back of the slide through
a thin film of glycerol. Laser light was directed at the prism such that at the glass/sample
interface it subtends an angle of approximately 68° to the normal of the slide and
subsequently undergoes Total Internal Reflection (TIR). The critical angle for
glass/water interface is 66°.
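
The quoted critical angle follows from Snell's law, as the sketch below illustrates. The refractive indices are assumed typical values for fused silica and water at visible wavelengths and are not taken from the text.

```python
import math

N_FUSED_SILICA = 1.46  # assumed typical value, not stated in the text
N_WATER = 1.33         # assumed typical value, not stated in the text

def critical_angle_deg(n_dense, n_rare):
    """Critical angle (degrees from the normal) for light passing from the
    denser medium towards the rarer one: sin(theta_c) = n_rare / n_dense."""
    if n_rare >= n_dense:
        raise ValueError("total internal reflection requires n_dense > n_rare")
    return math.degrees(math.asin(n_rare / n_dense))

def undergoes_tir(incidence_deg, n_dense, n_rare):
    """True if the angle of incidence exceeds the critical angle."""
    return incidence_deg > critical_angle_deg(n_dense, n_rare)

theta_c = critical_angle_deg(N_FUSED_SILICA, N_WATER)
tir_at_68 = undergoes_tir(68.0, N_FUSED_SILICA, N_WATER)
```

Under these assumed indices the critical angle comes out close to 66°, so an incidence angle of approximately 68° lies just inside the TIR regime.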

Fluorescence from single molecules of DNA-Cy3, produced by excitation with
the surface-specific evanescent wave following TIR, was collected by the objective lens
of the microscope and imaged onto an Intensified Charge Coupled Device (ICCD)
camera (Pentamax, Princeton Instruments, NJ). The image was recorded using a 532 nm
excitation (frequency-doubled solid-state Nd:YAG, Antares, Coherent) with a 580 nm
fluorescence (580DF30, Omega Optics, USA) filter for Cy3. Images were recorded with
an exposure time of 500 ms at the maximum gain of 10 on the ICCD. Laser powers
incident at the prism were 50 mW at 532 nm.

Single molecules were identified as described in Example 2.

The surface density of single molecules of DNA-Cy3 was measured at
approximately 500 per 100 μm x 100 μm image or 5 x 10^6 cm^-2.
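
The unit conversion behind this figure can be sketched as below, using the count and image size stated above; the function names are hypothetical.

```python
UM_PER_CM = 1e4  # micrometres per centimetre

def surface_density_per_cm2(count, width_um, height_um):
    """Molecules per square centimetre for a count observed in a rectangular image."""
    area_cm2 = (width_um / UM_PER_CM) * (height_um / UM_PER_CM)
    return count / area_cm2

def area_per_molecule_um2(count, width_um, height_um):
    """Average image area available to each molecule, in square micrometres."""
    return (width_um * height_um) / count

density = surface_density_per_cm2(500, 100.0, 100.0)
area_each = area_per_molecule_um2(500, 100.0, 100.0)
```

At this density each molecule occupies on average about 20 square micrometres of the image, which is large compared with a single 0.176 μm pixel.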



